A stratified traffic sampling methodology for seeing the big picture
Abstract
This work explores the use of statistical techniques, namely stratified sampling and cluster analysis, as powerful tools for deriving traffic properties at the flow level. Our results show that an adequate selection of samples leads to significant improvements and enables further statistical analysis. Although stratified sampling is a well-known technique, the way we classify the data prior to sampling is innovative and deserves special attention. We evaluate two partitioning clustering methods, namely clustering large applications (CLARA) and K-means, and validate their outcomes by using them as thresholds for stratified sampling. We show that, by using flow sizes to divide the population, we can obtain accurate estimates for both flow sizes and flow durations. The presented sampling and clustering classification techniques achieve data reduction levels higher than those of existing methods, on the order of 0.1%, while maintaining good accuracy for the estimates of the sum, mean and variance of both flow durations and sizes. © 2008 Elsevier B.V. All rights reserved.
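The pipeline the abstract describes (cluster flow sizes, use the cluster boundaries as strata thresholds, then sample within each stratum) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the quantile-based initialisation, the midpoint rule for thresholds, the equal per-stratum sampling fraction and the synthetic Pareto-distributed flows are all assumptions made for the example, and CLARA is not shown.

```python
import random
import statistics


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's algorithm on scalars; returns the sorted cluster centers."""
    ordered = sorted(values)
    # Assumption: start from evenly spaced quantiles so the run is deterministic.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # An empty cluster keeps its previous center.
        new_centers = [statistics.fmean(g) if g else centers[j]
                       for j, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)


def thresholds_from_centers(centers):
    """Assumption: midpoints between adjacent centers act as strata boundaries."""
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


def stratify(flows, thresholds):
    """Assign each (size, duration) flow to a stratum according to its size."""
    strata = [[] for _ in range(len(thresholds) + 1)]
    for flow in flows:
        index = sum(flow[0] > t for t in thresholds)
        strata[index].append(flow)
    return strata


def stratified_estimates(strata, fraction, rng, field):
    """Estimate the population total and mean of flow[field].

    Each non-empty stratum contributes (stratum size) x (sample mean),
    using a simple random sample drawn without replacement.
    """
    population = sum(len(s) for s in strata)
    total = 0.0
    for stratum in strata:
        if not stratum:
            continue
        n = max(1, round(fraction * len(stratum)))
        sample = rng.sample(stratum, n)
        total += len(stratum) * statistics.fmean(f[field] for f in sample)
    return total, total / population


# Synthetic heavy-tailed flows: (size in bytes, duration in seconds).
rng = random.Random(0)
flows = []
for _ in range(5000):
    size = rng.paretovariate(1.5) * 1000
    flows.append((size, 0.01 * size ** 0.5 + rng.random()))

cuts = thresholds_from_centers(kmeans_1d([f[0] for f in flows], k=3))
strata = stratify(flows, cuts)
est_total, est_mean = stratified_estimates(strata, 0.01, rng, field=0)
print("strata sizes:", [len(s) for s in strata])
print("estimated mean size:", est_mean,
      "true mean size:", statistics.fmean(f[0] for f in flows))
```

The example samples 1% of each stratum because the synthetic population is small; the 0.1% level quoted in the abstract refers to the paper's own traces. Passing `field=1` estimates durations from the same size-based strata, which mirrors the abstract's claim that partitioning by flow size also supports duration estimates.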
Similar resources
Importance sampling for speed-up simulation of heterogeneous MPEG sources
The ISO/ITU standard MPEG is expected to be used extensively for the transfer of video/moving-picture traffic in the coming high-capacity ATM networks. This traffic will stem from multimedia services such as teleconferencing and video distribution. Hence, MPEG-encoded video will be a salient constituent of the overall traffic. The encoding causes large and abrupt shifts in the transferred r...
Urban network risk assessment using Fuzzy-AHP and TOPSIS in GIS environment
Risk assessment of an urban network using traffic indicators identifies vulnerable links with a high danger of traffic incidents. Determining an appropriate methodology to achieve this objective therefore remains a big challenge. This paper proposes a methodology based on the data fusion concept, using Fuzzy-AHP and TOPSIS, to achieve this aim. The proposed methodology tries to overcome two main problems,...
Preface to the Special Issue
Large-scale assessments, such as Trends in International Mathematics and Science Study (TIMSS) and Programme for International Student Assessment (PISA), provide big data in the educational context. A researcher who wants to conduct a secondary analysis using this big data has to notice that analyses of this type of data require considering some technical complexities. Among these complexities ...
Implementation of Random Forest Algorithm in Order to Use Big Data to Improve Real-Time Traffic Monitoring and Safety
Nowadays, active traffic management can perform better owing to the real-time, large-scale nature of data in transportation systems. With the advancement of big data, actively and appropriately monitoring and improving traffic safety has become a necessity. Performance efficiency and traffic safety are considered important elements in measuring the pe...
Probe-based Arterial Link Travel Time Estimates for Its Applications
The use of probe vehicles to provide estimates of link travel times has been suggested as a means of obtaining travel times within signalized networks for use in advanced traveler information systems (ATIS). Previous research has shown that bias in arrival time distributions of probe vehicles will lead to a systematic bias in the sample estimate of the mean. This paper proposes a methodology fo...
Journal title:
- Computer Networks
Volume 52
Publication date 2008